felixbarny
Member

@felixbarny felixbarny commented Aug 5, 2025

Optimizes IP field parsing in the following ways:

  • Leverages XContentParser#optimizedTextOrNull to avoid the UTF-8 to Java String conversion overhead.
  • Avoids the expensive ipString.split(":") in favor of a more efficient algorithm that iterates over all character bytes only once.
  • Reduces memory allocations by avoiding a roundtrip through InetAddress. This requires creating an ESInetAddressPoint class that's similar to Lucene's InetAddressPoint as the latter can only be constructed via an InetAddress.

The semantics are kept intact, and all IP parsing related tests still pass.

This could potentially hurt the performance of code paths that don't have access to a UTF-8 encoded byte array (but have a String), as the String now needs to be converted to a byte array first. However, at first glance these code paths don't seem to be as performance sensitive.
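
To illustrate the single-pass idea from the list above, here is a minimal sketch for the IPv4 case only. It is not the code from this PR: the class and method names are hypothetical, the rejection of leading zeros is an assumption about the desired semantics, and IPv6 parsing is left out entirely. The output uses the 16-byte IPv4-mapped IPv6 layout (`::ffff:a.b.c.d`), so no `InetAddress` is allocated along the way.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of one-pass IPv4 parsing over UTF-8 bytes; not the PR's implementation. */
final class Ipv4SinglePassSketch {

    // Offset of the four IPv4 octets inside the 16-byte IPv4-mapped IPv6 form.
    private static final int IPV4_OFFSET = 12;

    /**
     * Parses a dotted-quad address from UTF-8 bytes, visiting each byte once.
     * Returns the 16-byte IPv4-mapped IPv6 encoding, or null if the input is invalid.
     */
    static byte[] encodeIpv4(byte[] utf8, int offset, int length) {
        byte[] out = new byte[16];
        out[10] = (byte) 0xff; // bytes 0..9 stay zero, bytes 10..11 are 0xff
        out[11] = (byte) 0xff;
        int octet = 0;  // value of the octet currently being read
        int digits = 0; // number of digits seen in the current octet
        int index = 0;  // which octet (0..3) is being filled
        for (int i = offset; i < offset + length; i++) {
            byte b = utf8[i];
            if (b >= '0' && b <= '9') {
                if (digits == 1 && octet == 0) {
                    return null; // assumption: leading zeros such as "01" are rejected
                }
                octet = octet * 10 + (b - '0');
                digits++;
                if (octet > 255) {
                    return null;
                }
            } else if (b == '.') {
                if (digits == 0 || index == 3) {
                    return null; // empty octet or too many dots
                }
                out[IPV4_OFFSET + index++] = (byte) octet;
                octet = 0;
                digits = 0;
            } else {
                return null; // not a digit or a dot
            }
        }
        if (digits == 0 || index != 3) {
            return null; // trailing dot, empty input, or too few octets
        }
        out[IPV4_OFFSET + index] = (byte) octet;
        return out;
    }

    /** String entry point: pays for one String-to-bytes conversion before parsing. */
    static byte[] encodeIpv4(String ip) {
        byte[] utf8 = ip.getBytes(StandardCharsets.UTF_8);
        return encodeIpv4(utf8, 0, utf8.length);
    }
}
```

The `String` overload shows the trade-off mentioned above: callers that only have a `String` pay for one extra conversion, while callers that already hold UTF-8 bytes can pass them in directly.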

@felixbarny felixbarny added >non-issue :Search Foundations/Mapping Index mappings, including merging and defining field types :StorageEngine/Mapping The storage related side of mappings labels Aug 5, 2025
@elasticsearchmachine elasticsearchmachine added v9.2.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Aug 5, 2025
@martijnvg
Member

This looks like a nice improvement. Do you have an idea how this impacts indexing performance? For example, by running metricsgenreceiver or a test workload with just IPs?

@felixbarny felixbarny marked this pull request as ready for review August 11, 2025 09:06
@felixbarny felixbarny requested a review from a team as a code owner August 11, 2025 09:06
@elasticsearchmachine elasticsearchmachine added Team:StorageEngine Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch labels Aug 11, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@felixbarny
Member Author

Based on profiles I captured for 30s of a run of metricsgenreceiver before and after the optimizations, there are about 60% fewer samples for InetAddresses.forString(String) with this optimization (588 vs 1433 samples). In addition to that, optimizedTextOrNull is a little more efficient than textOrNull (229 vs 257 samples). With CBOR (after #132542), this gets significantly faster (115 samples). It's difficult to compare the full IpFieldMapper#parseCreateField time before and after because the run with the optimizations also included #132566.

@felixbarny felixbarny requested a review from romseygeek August 15, 2025 06:53
Contributor

@romseygeek romseygeek left a comment

LGTM, nice improvement.

Just to double check that I'm reading it correctly: for parsers that don't support optimizedText(), this will still end up doing just a single String-to-bytes conversion in the parser itself, right?

Member

@rjernst rjernst left a comment

I only reviewed the changes to InetAddresses.

Contributor

@ldematte ldematte left a comment

I think this PR would greatly benefit from some JMH benchmarks to show the differences and guide some development choices (e.g. specialized functions for String vs byte[])

@felixbarny
Member Author

> I think this PR would greatly benefit from some JMH benchmarks to show the differences and guide some development choices (e.g. specialized functions for String vs byte[])

The approach I took here was to run metric ingestion benchmarks and analyze CPU and allocation flame graphs, to get a better understanding of the real-world impact outside of narrow microbenchmarks.

@felixbarny
Member Author

Sorry, I think I misinterpreted your suggestion. It makes sense to compare the performance of the String-based methods before and after this change to see what the difference is. It may actually not be a regression because of the other optimizations. Let's see.

@felixbarny
Member Author

I've added benchmarks and compared the before and after.

To summarize, the throughput is higher in all scenarios after the changes proposed in this PR. The only regression is that there are more allocations when handling IPv4 addresses.

                                                             Before (main)           After (this PR)
Benchmark                                (size)   Mode  Cnt       Score      Error       Score      Error   Units
encodeAsIpv6WithIpv4                       1000  thrpt    3                          25912.287 ±  425.865   ops/s
encodeAsIpv6WithIpv4:gc.alloc.rate.norm    1000  thrpt    3                          32000.027 ±    0.001    B/op
encodeAsIpv6WithIpv6                       1000  thrpt    3                           6748.165 ± 1197.371   ops/s
encodeAsIpv6WithIpv6:gc.alloc.rate.norm    1000  thrpt    3                          32000.104 ±    0.017    B/op
forStringIpv4Bytes                         1000  thrpt    3                          22505.172 ±  306.497   ops/s
forStringIpv4Bytes:gc.alloc.rate.norm      1000  thrpt    3                          80000.031 ±    0.001    B/op
forStringIpv6Bytes                         1000  thrpt    3                           6190.543 ± 1989.384   ops/s
forStringIpv6Bytes:gc.alloc.rate.norm      1000  thrpt    3                         152000.113 ±    0.037    B/op
forStringIpv4String                        1000  thrpt    3    18724.122 ± 345.067   22477.031 ±  209.926   ops/s
forStringIpv4String:gc.alloc.rate.norm     1000  thrpt    3    80000.037 ±   0.002  111992.031 ±    0.002    B/op
forStringIpv6String                        1000  thrpt    3     3356.420 ±  93.202    5582.589 ± 3434.972   ops/s
forStringIpv6String:gc.alloc.rate.norm     1000  thrpt    3   696000.209 ±   0.035  208000.125 ±    0.079    B/op
getIpOrHostIpv4                            1000  thrpt    3    18902.280 ± 352.800   22516.581 ±  476.120   ops/s
getIpOrHostIpv4:gc.alloc.rate.norm         1000  thrpt    3    80000.037 ±   0.001  112000.031 ±    0.001    B/op
getIpOrHostIpv6                            1000  thrpt    3     2056.111 ±  11.173    3186.665 ±  106.916   ops/s
getIpOrHostIpv6:gc.alloc.rate.norm         1000  thrpt    3  1104000.343 ±   0.049  616000.221 ±    0.035    B/op
isInetAddressIpv4                          1000  thrpt    3    25479.685 ±  66.369   25844.380 ±  178.405   ops/s
isInetAddressIpv4:gc.alloc.rate.norm       1000  thrpt    3    24000.027 ±   0.001   56000.027 ±    0.001    B/op
isInetAddressIpv6                          1000  thrpt    3     3513.745 ± 965.823    6981.567 ± 1365.645   ops/s
isInetAddressIpv6:gc.alloc.rate.norm       1000  thrpt    3   576000.200 ±   0.074   88000.100 ±    0.019    B/op
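
For readers who want the relative deltas rather than raw scores, the sketch below computes the percent change for a few rows copied from the table above. The class and method names are hypothetical. Each benchmark ran only 3 iterations and several error bars are wide (for example ± 3434.972 ops/s on `forStringIpv6String`), so treat the resulting percentages as rough indications.

```java
/** Hypothetical helper for reading the table above; scores are copied verbatim from it. */
final class BenchmarkDeltaSketch {

    /** Relative change from before to after, in percent (positive means the value increased). */
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        String[] names = {
            "forStringIpv4String (ops/s)",
            "forStringIpv6String (ops/s)",
            "forStringIpv4String (B/op)",
            "forStringIpv6String (B/op)",
        };
        // Each row holds { before (main), after (this PR) }.
        double[][] scores = {
            { 18724.122, 22477.031 },
            { 3356.420, 5582.589 },
            { 80000.037, 111992.031 },
            { 696000.209, 208000.125 },
        };
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-30s %+.1f%%%n", names[i], percentChange(scores[i][0], scores[i][1]));
        }
    }
}
```

For throughput rows a positive change is an improvement; for the `gc.alloc.rate.norm` rows a positive change means more bytes allocated per operation.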

Member

@rjernst rjernst left a comment

LGTM

@felixbarny felixbarny merged commit 6dae011 into elastic:main Aug 19, 2025
34 checks passed
@felixbarny felixbarny deleted the ip-parsing-optimization branch August 19, 2025 15:16
@ldematte
Contributor

> I've added benchmarks and compared the before and after.

Thanks! Looks very good indeed!

@felixbarny felixbarny self-assigned this Aug 25, 2025